A New Framework for Mandarin LVCSR Based on One-pass Decoder

Authors

  • Sheng GAO
  • Bo XU
  • Taiyi HUANG
Abstract

This paper describes a new framework for Mandarin LVCSR based on a one-pass decoder and decision-tree-based class-triphone acoustic modeling. Compared with a multi-pass decoder, a one-pass decoder can be more efficient and make fuller use of the available knowledge, because all knowledge sources are applied at the same time, provided the decoder is well organized and optimized. We describe the organization of our one-pass decoder in detail and explain how it handles the search-space explosion caused by the very large number of triphones and by cross-word extension, which must deal with an unknown right context, including the tone context. Experimental results on the male test database from the China National Hi-Tech Project 863 show that, with the non-tonal class-triphone model, the character error rate (CER) is reduced to 13.04% with an open LM and 2.8% with a closed LM. With the tonal class-triphone model, the CER reaches 10.31%, a 21% relative reduction in character errors compared with the non-tonal class-triphone model.
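The relative reduction quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `relative_reduction` is hypothetical, and pairing the 10.31% tonal result with the 13.04% open-LM baseline is an assumption, since the abstract does not state which LM condition the tonal figure refers to.

```python
def relative_reduction(baseline_cer: float, new_cer: float) -> float:
    """Return the relative error reduction as a fraction of the baseline."""
    if baseline_cer <= 0:
        raise ValueError("baseline CER must be positive")
    return (baseline_cer - new_cer) / baseline_cer


# CER values in percent, as quoted in the abstract.
# Assumption: the tonal result is compared against the open-LM baseline.
non_tonal_cer = 13.04
tonal_cer = 10.31

reduction = relative_reduction(non_tonal_cer, tonal_cer)
print(f"Relative CER reduction: {reduction:.1%}")
```

Under that assumption the result is roughly 20.9%, which is consistent with the rounded 21% reported in the abstract.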

Similar resources

Update progress of Sinohear: advanced Mandarin LVCSR system at NLPR

NLPR has long worked on Mandarin speech recognition. This paper reports our recent progress in this field, which has several significant novel characteristics: 1) very large speech databases are used to learn more robust acoustic models; 2) the acoustic model has evolved from non-tonal class-triphone to tonal class-triphone based on a tone-embedded decision tree, namely unified tone & triphone mod...

A multi-pass error detection and correction framework for Mandarin LVCSR

We previously proposed a multi-pass framework for Large Vocabulary Continuous Speech Recognition (LVCSR). The objective of this framework is to apply sophisticated linguistic models for recognition, while maintaining a balance between complexity and efficiency. The framework is composed of three passes: initial recognition, error detection and error correction. This paper presents and evaluates...

Decoding-time prediction of non-verbalized punctuation

This paper presents novel methods that integrate lexical prediction of non-verbalized punctuation with Viterbi decoding for Large Vocabulary Conversational Speech Recognition (LVCSR) in a single pass. We describe two different approaches, one based on a modified finite state machine representation of language models and one based on an extension of an LVCSR decoder. We discuss advantages over t...

Development of CSLU LVCSR: the 1997 DARPA Hub4 Evaluation System

This paper presents the CSLU Broadcast News transcription system used in the DARPA 1997 evaluation. The system was built using the software developed for the CSLU LVCSR project, which started in January 1997. This 25K-word vocabulary system used continuous HMMs for acoustic modeling and the standard backoff trigram as the language model. The search used a single-pass decoder with MLLR-based adaptation ...

Hierarchical processing of the modulation spectrum for GALE Mandarin LVCSR system

This paper aims at investigating the use of TANDEM features based on hierarchical processing of the modulation spectrum. The study is done in the framework of the GALE project for recognition of Mandarin Broadcast data. We describe the improvements obtained using the hierarchical processing and the addition of features like pitch and short-term critical band energy. Results are consistent with ...

Publication date: 2000